Building Trustable Enterprise Data Science Products

This article was published in December 2018 on Kernel, the blog of Rubikloud, where I worked at the time. Rubikloud has since been acquired by Kinaxis.

At Rubikloud, we deliver Intelligent Decision Automation for the Enterprise. We have two products: Price & Promotion Manager, which delivers automated mass promotional demand forecasting at chain, store, and product levels, and Customer LifeCycle Manager (CLCM), which leverages Data Science to understand customers and automatically generate curated experiences across various channels and customer touchpoints.

Fig 1 (image: Screen-Shot-2018-12-06-at-1.23.59-PM-1024x453.png)

Our models live in a rich ecosystem as shown in the above diagram. The type of Product Management we need to make these products successful requires strength in and management of:

The Retail business: Merchandising for Promotion Manager and Marketing/CRM for CLCM. Merchandising departments plan and develop strategies to enable a retailer to sell a range of products en masse to deliver sales and profit targets. Marketing departments are responsible for managing a retailer’s brand and conducting campaigns that serve the retailer’s marketing initiatives.
Enterprise Services: What it takes to roll out Enterprise products that integrate with systems and processes on the client side.
Data Science: How to productize a model and translate results to business impact.
Software Engineering: How to build reliable scalable infrastructure and workflows to support our models and the huge amounts of data we ingest and use.

Fig 2: Aspects of Rubikloud Product Management

You can start to get an idea of how unique the products we build are. Most software products out there have only a couple of the above aspects. Each of these aspects warrants a blog post of its own, so for the rest of this post, I’ll focus on one aspect of building Data Science products: building products that users trust.

Building Trustable Data Science Products

As Data Science permeates more businesses, data science practitioners and business decision makers are finding ways to integrate and adopt it. Topics such as Model Explainability and “Human-in-the-loop” approaches are being discussed because a big part of achieving Data Science adoption in the Enterprise is making sure that users have a basic understanding of the Data Science system or product and trust it, rather than thinking of it as a black box.

For Rubikloud, the products we build make predictions and forecasts that influence business decisions on our clients’ side. The processes we affect, be it sales forecasting or marketing management, are well-established in most retailers. It is not easy to ask business users to “trust the machine” for decisions they have made themselves for the longest time, even if these decisions are only based on personal experience and heuristics.

From a product management standpoint, we are cognizant of this challenge. To lay the foundation, our products are meant to aid business decision making, not to replace the human factor altogether. Our products are essentially systems of insights and predictions that are relevant to and integrated with the business of our clients, and that include features specifically designed with levers that users can control.

Here are some of these features:

1. Business Rules

One of the reasons business users sometimes find it hard to ‘let go’ of decision making is that retailers make a lot of adjustments and exceptions at the level of an individual campaign or sales forecast. We have developed a Linear Programming module that allows us to optimize for a certain business objective while honouring constraints that we expose to the user as business rules. Users can turn these rules on and off and configure their thresholds through an intuitive UI that uses terms the business users are familiar with.
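As a rough illustration of how such rules could map onto an optimizer, the sketch below turns user-toggled rules into the inequality rows (A_ub, b_ub) that a typical Linear Programming solver accepts, and checks a candidate plan against them. It is not our actual module: the BusinessRule class, the rule names, and the numbers are all hypothetical, and the solver itself is left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BusinessRule:
    """Hypothetical user-facing rule backed by one linear constraint:
    sum(coeffs[i] * x[i]) <= threshold."""
    name: str
    coeffs: List[float]
    threshold: float
    enabled: bool = True


def active_constraints(rules: List[BusinessRule]) -> Tuple[List[List[float]], List[float]]:
    """Return (A_ub, b_ub) for the enabled rules only."""
    a_ub = [rule.coeffs for rule in rules if rule.enabled]
    b_ub = [rule.threshold for rule in rules if rule.enabled]
    return a_ub, b_ub


def violated_rules(plan: List[float], rules: List[BusinessRule]) -> List[str]:
    """List the names of enabled rules that a candidate plan breaks."""
    broken = []
    for rule in rules:
        if not rule.enabled:
            continue  # rules switched off in the UI are ignored
        lhs = sum(c * x for c, x in zip(rule.coeffs, plan))
        if lhs > rule.threshold + 1e-9:
            broken.append(rule.name)
    return broken


# Made-up example: discount depth (in %) for products A, B and C.
rules = [
    BusinessRule("Total discount budget", [1.0, 1.0, 1.0], 60.0),
    BusinessRule("Max discount on product A", [1.0, 0.0, 0.0], 25.0),
    BusinessRule("Max discount on product C", [0.0, 0.0, 1.0], 10.0, enabled=False),
]

a_ub, b_ub = active_constraints(rules)
broken = violated_rules([30.0, 20.0, 15.0], rules)
print(broken)
```

Because each rule carries its own name and threshold, the same objects can drive both the UI labels and the constraint matrix, which keeps the business vocabulary and the optimization in sync.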

2. Sanity Validations

The outputs generated by our systems typically need to be verified and validated. Verification is a fairly straightforward technical undertaking that we have automated, and typically includes verifying whether the number of predictions generated is within a certain range.

Validating outputs is not as straightforward, because you want to make sure the outputs “make sense” from a business perspective. We have developed modules that intelligently carry out validations by automating what a human data analyst would do to validate results, such as checking that a certain output follows a certain distribution.

We call these “Sanity Validations” and have also exposed them in the UI so users can turn them on/off and configure warning and severe thresholds. With each iteration of output generation, the results of these validations are sent to users, who can make an educated decision about the quality of the output. Our Data Analytics team had a lot of input in the development of these modules, as the team houses a mix of business and technical experience.
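To make the distinction concrete, here is a minimal sketch of one verification and one validation with configurable warning and severe thresholds. It is an illustration only: the function names, the default thresholds, and the choice of a simple mean-shift check (a stand-in for a fuller distribution check) are assumptions, not a description of our modules.

```python
from statistics import mean
from typing import List


def verify_count(predictions: List[float], min_count: int, max_count: int) -> bool:
    """Verification: is the number of predictions within the expected range?"""
    return min_count <= len(predictions) <= max_count


def validate_mean_shift(
    predictions: List[float],
    history: List[float],
    warning: float = 0.2,
    severe: float = 0.5,
    enabled: bool = True,
) -> str:
    """Validation: compare the mean prediction with the historical mean.

    Returns "skipped", "ok", "warning" or "severe". The thresholds are
    relative shifts (0.2 means 20%). Assumes a non-zero historical mean.
    """
    if not enabled:
        return "skipped"  # the user switched this validation off
    baseline = mean(history)
    shift = abs(mean(predictions) - baseline) / abs(baseline)
    if shift >= severe:
        return "severe"
    if shift >= warning:
        return "warning"
    return "ok"


# Made-up numbers for illustration.
history = [100, 110, 90, 100]
forecasts = [130, 125, 135, 130]

count_ok = verify_count(forecasts, min_count=3, max_count=10)
status = validate_mean_shift(forecasts, history)
print(count_ok, status)
```

Returning a status rather than raising an error mirrors the idea that the results are reported to users, who then decide what to do with the output.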

3. Output Sample Review

Where relevant, we allow our users to review a random sample of the outputs, accompanied by contextual information that allows them to assess its quality. We also provide general statistics on the outputs so clients can make sure the outputs meet their expectations. This is particularly important for outputs that are “hard to assess” such as the outputs of recommendation systems.
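A sample review of this kind can be sketched in a few lines. The example below draws a reproducible random sample of output rows, each carrying some context, and computes general statistics on one field. The field names, the data, and both helper functions are hypothetical.

```python
import random
from statistics import mean, median
from typing import Dict, List


def review_sample(outputs: List[Dict], k: int, seed: int = 0) -> List[Dict]:
    """Draw a reproducible random sample of up to k output rows for review."""
    rng = random.Random(seed)
    return rng.sample(outputs, min(k, len(outputs)))


def summary_stats(outputs: List[Dict], field: str) -> Dict[str, float]:
    """General statistics on one numeric field of the outputs."""
    values = [row[field] for row in outputs]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


# Made-up recommendation outputs; "last_purchase" is the contextual information.
outputs = [
    {"customer": "c1", "recommended": "shampoo", "score": 0.9, "last_purchase": "conditioner"},
    {"customer": "c2", "recommended": "lipstick", "score": 0.4, "last_purchase": "razor"},
    {"customer": "c3", "recommended": "sunscreen", "score": 0.7, "last_purchase": "after-sun lotion"},
    {"customer": "c4", "recommended": "mascara", "score": 0.6, "last_purchase": "eyeliner"},
    {"customer": "c5", "recommended": "toothpaste", "score": 0.8, "last_purchase": "toothbrush"},
]

sample = review_sample(outputs, k=3)
stats = summary_stats(outputs, "score")
print(sample)
print(stats)
```

Fixing the seed means two reviewers looking at the same run see the same sample, which makes discussions about output quality easier.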

4. Override Model Outputs

We know that in some cases business users will have knowledge that’s not translatable to data, or at least not to data that we ingest regularly. In these cases, the users will want to override a prediction that our algorithms have made. This may happen either if the user “feels” the prediction is incorrect, or if the user has information, outside the product, that affects this prediction. Our interface provides features for this, whereby users can choose to override a specific prediction. We have found that, with time, as our models improve in performance, business users trust the machine more, and the use of override features declines.

Fig 3: Use of the ‘Override’ feature over time
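A minimal sketch of the override mechanics, and of the usage measure that a chart like Fig 3 could be built from, might look as follows. The function names, the SKU keys, and the numbers are hypothetical.

```python
from typing import Dict


def apply_overrides(predictions: Dict[str, float], overrides: Dict[str, float]) -> Dict[str, float]:
    """Return final values: the user's override where one exists, else the model's prediction."""
    unknown = set(overrides) - set(predictions)
    if unknown:
        raise KeyError(f"Overrides for unknown items: {sorted(unknown)}")
    return {key: overrides.get(key, value) for key, value in predictions.items()}


def override_rate(predictions: Dict[str, float], overrides: Dict[str, float]) -> float:
    """Share of predictions that the user chose to override in this cycle."""
    if not predictions:
        return 0.0
    return len(set(overrides) & set(predictions)) / len(predictions)


# Made-up forecast cycle: the user overrides one of four predictions.
predictions = {"sku-1": 120.0, "sku-2": 80.0, "sku-3": 45.0, "sku-4": 10.0}
overrides = {"sku-2": 95.0}

final = apply_overrides(predictions, overrides)
rate = override_rate(predictions, overrides)
print(final, rate)
```

Keeping the model's predictions and the overrides as separate records, rather than overwriting one with the other, is what makes it possible to track the override rate from one cycle to the next.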

5. Users affecting model parameters

We use a probabilistic approach for certain predictions our products make, and our systems suggest best practices for these approaches. As our products matured, we exposed options in our UI for power users to control some parameters, such as safety scores and confidences, based on their knowledge of external events. So that users can do this in an educated way, we provide them with context on how a given prediction compares to the history of the predicted event.
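One way to picture a user-controlled confidence is as a quantile of a predictive distribution. The sketch below assumes, purely for illustration, that a forecast is normally distributed with a known mean and standard deviation; the function names and numbers are hypothetical and do not describe our models.

```python
from statistics import NormalDist
from typing import List


def forecast_at_confidence(mean_forecast: float, std_dev: float, confidence: float) -> float:
    """Quantile of an assumed normal predictive distribution at the chosen confidence.

    A confidence of 0.5 returns the mean; higher values return a more
    conservative (larger) figure to plan against.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return NormalDist(mean_forecast, std_dev).inv_cdf(confidence)


def share_of_history_below(value: float, history: List[float]) -> float:
    """Context for the user: fraction of historical outcomes at or below a value."""
    return sum(1 for h in history if h <= value) / len(history)


# Made-up numbers: a demand forecast and past outcomes of comparable events.
history = [80, 95, 100, 110, 130, 90, 105, 120]

median_plan = forecast_at_confidence(100.0, 20.0, confidence=0.5)
cautious_plan = forecast_at_confidence(100.0, 20.0, confidence=0.9)
context = share_of_history_below(cautious_plan, history)
print(round(median_plan, 2), round(cautious_plan, 2), context)
```

Showing the historical share next to the adjusted figure gives the user a sense of how unusual their chosen setting is compared to what has actually happened before.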

A summary of my tips for the topic would be:

When building Data Science products, avoid building a black box product and design for features that allow domain experts to affect results.
Spend as much time making data science models user-centric as you do making models elegant, if not more.
It takes many iterations to get the marriage of data science and business right. Plan with that in mind.

You can read more about our Data Science and Product & Analytics work.

For related topics, see the blog post “AI: The Next Evolution of Automation” by our Chief Data Scientist, Brian Keng, about automating AI systems.

See also the blog post “Sheltering Models: Machine Learning Engineering as a Gradual Need” by our Data Science Manager, Javier Moreno, for more about the RkLuigi library that we developed to plumb production-grade machine learning systems with tremendous amounts of data at the core.
